A Sound Polymorphic Type System for a Dialect of C

Authors

Geoffrey Smith

Dennis M. Volpano

Abstract

…ion that represents a function with formal parameters $x_1, \ldots, x_n$ and body $e$.

One might expect that addresses would just be natural numbers, but that would not allow the semantics to detect invalid pointer arithmetic. So instead an address is a pair of natural numbers $(i, j)$, where $i$ is the segment number and $j$ is the offset. Intuitively, we put each variable or array into its own segment. Thus a simple variable has address $(i, 0)$, and an $n$-element array has addresses $(i, 0), (i, 1), \ldots, (i, n-1)$. Pointer arithmetic involves only the offset of an address, and dereferencing nonexistent or dangling pointers is detected as a "segmentation fault".

Next we identify the set of values $v$, consisting of literals, pointers, and lambda abstractions:

$$v ::= c \mid (a, 0) \mid \lambda x_1, \ldots, x_n.\, e$$

The result of a successful evaluation is always a value.

Finally, we require the notion of a memory. A memory $\mu$ is a finite function mapping addresses to values, or to the special results dead and uninit. These results indicate that the cell with that address has been deallocated or is uninitialized, respectively. We write $\mu(a)$ for the contents of address $a \in \mathrm{dom}(\mu)$, and we write $\mu[a := v]$ for the memory that assigns value $v$ to address $a$, and value $\mu(a')$ to any address $a'$ other than $a$. Note that $\mu[a := v]$ is an update of $\mu$ if $a \in \mathrm{dom}(\mu)$ and an extension of $\mu$ if $a \notin \mathrm{dom}(\mu)$.

We can now define the evaluation relation

$$\mu \vdash e \Rightarrow v, \mu'$$

which asserts that evaluating closed expression $e$ in memory $\mu$ results in value $v$ and new memory $\mu'$. The evaluation rules are given in Figures 3 and 4. We write $[e'/x]e$ to denote the capture-avoiding substitution of $e'$ for all free occurrences of $x$ in $e$. Note the use of substitution in rules (bindvar), (bindarr), (bindfun), and (apply). It allows us to avoid environments and closures in the semantics, so that the result of evaluating a Polymorphic C expression is just another Polymorphic C expression.
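This is made possible by the flexible syntax of the language and the fact that only closed expressions are ever evaluated during the evaluation of a closed expression. We remark that rule (apply) specifies that function arguments are evaluated left to right; C leaves the evaluation order unspecified. Also, note that if there were no & operator, there would be no need to specify in rule (bindvar) that a variable dies at the end of its scope; it would simply become unreachable at that point (and its storage could be reused).

(val)
$$\mu \vdash v \Rightarrow v, \mu$$

(contents)
$$\dfrac{a \in \mathrm{dom}(\mu) \text{ and } \mu(a) \text{ is a value}}{\mu \vdash (a, 1) \Rightarrow \mu(a), \mu}$$

(deref)
$$\dfrac{\mu \vdash e \Rightarrow (a, 0), \mu' \qquad a \in \mathrm{dom}(\mu') \text{ and } \mu'(a) \text{ is a value}}{\mu \vdash \texttt{*}e \Rightarrow \mu'(a), \mu'}$$

(ref)
$$\mu \vdash \texttt{\&}(a, 1) \Rightarrow (a, 0), \mu \qquad\qquad \dfrac{\mu \vdash e \Rightarrow (a, 0), \mu'}{\mu \vdash \texttt{\&*}e \Rightarrow (a, 0), \mu'}$$

(offset)
$$\dfrac{\mu \vdash e_1 \Rightarrow ((i, j), 0), \mu_1 \qquad \mu_1 \vdash e_2 \Rightarrow n, \mu' \quad (n \text{ an integer})}{\mu \vdash e_1 \texttt{+} e_2 \Rightarrow ((i, j + n), 0), \mu'}$$

(update)
$$\dfrac{\mu \vdash e \Rightarrow v, \mu' \qquad a \in \mathrm{dom}(\mu') \text{ and } \mu'(a) \neq \text{dead}}{\mu \vdash (a, 1) \texttt{=} e \Rightarrow v, \mu'[a := v]}$$

$$\dfrac{\mu \vdash e_1 \Rightarrow (a, 0), \mu_1 \qquad \mu_1 \vdash e_2 \Rightarrow v, \mu_2 \qquad a \in \mathrm{dom}(\mu_2) \text{ and } \mu_2(a) \neq \text{dead}}{\mu \vdash \texttt{*}e_1 \texttt{=} e_2 \Rightarrow v, \mu_2[a := v]}$$

(sequence)
$$\dfrac{\mu \vdash e_1 \Rightarrow v_1, \mu_1 \qquad \mu_1 \vdash e_2 \Rightarrow v_2, \mu_2}{\mu \vdash e_1 \texttt{;} e_2 \Rightarrow v_2, \mu_2}$$

(branch)
$$\dfrac{\mu \vdash e_1 \Rightarrow n, \mu_1 \quad (n \text{ a nonzero integer}) \qquad \mu_1 \vdash e_2 \Rightarrow v, \mu'}{\mu \vdash \texttt{if (}e_1\texttt{) \{}e_2\texttt{\} else \{}e_3\texttt{\}} \Rightarrow v, \mu'}$$

$$\dfrac{\mu \vdash e_1 \Rightarrow 0, \mu_1 \qquad \mu_1 \vdash e_3 \Rightarrow v, \mu'}{\mu \vdash \texttt{if (}e_1\texttt{) \{}e_2\texttt{\} else \{}e_3\texttt{\}} \Rightarrow v, \mu'}$$

Fig. 3. The Evaluation Rules (Part 1)

(loop)
$$\dfrac{\mu \vdash e_1 \Rightarrow 0, \mu_1}{\mu \vdash \texttt{while (}e_1\texttt{) \{}e_2\texttt{\}} \Rightarrow \text{unit}, \mu_1}$$

$$\dfrac{\mu \vdash e_1 \Rightarrow n, \mu_1 \quad (n \text{ a nonzero integer}) \qquad \mu_1 \vdash e_2 \Rightarrow v, \mu_2 \qquad \mu_2 \vdash \texttt{while (}e_1\texttt{) \{}e_2\texttt{\}} \Rightarrow \text{unit}, \mu'}{\mu \vdash \texttt{while (}e_1\texttt{) \{}e_2\texttt{\}} \Rightarrow \text{unit}, \mu'}$$

(bindvar)
$$\dfrac{\mu \vdash e_1 \Rightarrow v_1, \mu_1 \qquad (i, 0) \notin \mathrm{dom}(\mu_1) \qquad \mu_1[(i, 0) := v_1] \vdash [((i, 0), 1)/x]e_2 \Rightarrow v_2, \mu_2}{\mu \vdash \texttt{var } x \texttt{ = } e_1\texttt{; } e_2 \Rightarrow v_2, \mu_2[(i, 0) := \text{dead}]}$$

(bindarr)
$$\dfrac{\mu \vdash e_1 \Rightarrow n, \mu_1 \quad (n \text{ a positive integer}) \qquad (i, 0) \notin \mathrm{dom}(\mu_1) \qquad \mu_1[(i, 0), \ldots, (i, n-1) := \text{uninit}, \ldots, \text{uninit}] \vdash [((i, 0), 0)/x]e_2 \Rightarrow v_2, \mu_2}{\mu \vdash \texttt{arr } x\texttt{[}e_1\texttt{]; } e_2 \Rightarrow v_2, \mu_2[(i, 0), \ldots, (i, n-1) := \text{dead}, \ldots, \text{dead}]}$$

(bindfun)
$$\dfrac{\mu \vdash [\lambda x_1, \ldots, x_n.\, e / x]e' \Rightarrow v, \mu'}{\mu \vdash x\texttt{(}x_1\texttt{,}\ldots\texttt{,}x_n\texttt{) \{}e\texttt{\} } e' \Rightarrow v, \mu'}$$

(apply)
$$\dfrac{\mu \vdash e \Rightarrow \lambda x_1, \ldots, x_n.\, e', \mu_1 \qquad \mu_1 \vdash e_1 \Rightarrow v_1, \mu_2 \quad \cdots \quad \mu_n \vdash e_n \Rightarrow v_n, \mu_{n+1} \qquad \mu_{n+1} \vdash [v_1, \ldots, v_n / x_1, \ldots, x_n]e' \Rightarrow v, \mu'}{\mu \vdash e\texttt{(}e_1\texttt{,}\ldots\texttt{,}e_n\texttt{)} \Rightarrow v, \mu'}$$

Fig. 4. The Evaluation Rules (Part 2)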
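The memory model behind these rules can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `Memory`, `ExecutionError`, `bind_var`, `bind_arr`, `end_scope`, and `offset` are hypothetical, and the sketch only mirrors the side conditions of (contents), (update), (offset), (bindvar), and (bindarr) under the assumption that fresh segment numbers are chosen by a counter.

```python
DEAD = object()    # marker: the cell has been deallocated
UNINIT = object()  # marker: the cell is allocated but not yet written


class ExecutionError(Exception):
    """Raised when a side condition of an evaluation rule fails."""


class Memory:
    def __init__(self):
        self.cells = {}        # (segment, offset) -> value, DEAD, or UNINIT
        self.next_segment = 0  # assumption: fresh segments come from a counter

    def bind_var(self, value):
        """Like (bindvar): put a variable in its own fresh segment."""
        i = self.next_segment
        self.next_segment += 1
        self.cells[(i, 0)] = value
        return i

    def bind_arr(self, n):
        """Like (bindarr): n uninitialized cells in a fresh segment."""
        if n <= 0:
            raise ExecutionError("array size must be positive")
        i = self.next_segment
        self.next_segment += 1
        for j in range(n):
            self.cells[(i, j)] = UNINIT
        return i

    def end_scope(self, i):
        """Mark every cell of segment i dead instead of removing it."""
        for addr in list(self.cells):
            if addr[0] == i:
                self.cells[addr] = DEAD

    def read(self, addr):
        """Like (contents)/(deref): the cell must exist and hold a value."""
        if addr not in self.cells:
            raise ExecutionError("nonexistent address")
        cell = self.cells[addr]
        if cell is DEAD or cell is UNINIT:
            raise ExecutionError("dead or uninitialized address")
        return cell

    def write(self, addr, value):
        """Like (update): the cell must exist and must not be dead."""
        if addr not in self.cells or self.cells[addr] is DEAD:
            raise ExecutionError("nonexistent or dead address")
        self.cells[addr] = value
        return value


def offset(addr, n):
    """Like (offset): pointer arithmetic changes only the offset."""
    i, j = addr
    return (i, j + n)


mem = Memory()
seg = mem.bind_arr(3)
p = offset((seg, 0), 1)
mem.write(p, 7)
print(mem.read(p))  # 7
```

Note that `offset` never fails by itself; as in the rules, an out-of-range pointer is only detected when it is read or written.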
Note that a successful evaluation always produces a value and a memory:

Lemma 1 If $\mu \vdash e \Rightarrow v, \mu'$, then $v$ is a value and $\mu'$ is a memory.

PROOF. By induction on the structure of the derivation. □

4 Type Preservation

We now turn to the question of the soundness of our type system. We begin in this section by using the framework of Harper [8] to prove that our type system satisfies the type preservation property (sometimes called the subject reduction property). This property basically asserts that types are preserved across evaluations; that is, if an expression of type $\tau$ evaluates successfully, it produces a value of type $\tau$.

But before we can do this, we need to extend our typing rules so that we can type the semantic values (variables, pointers, and lambda abstractions) introduced in Section 3.2. Typing a variable $(a, 1)$ or a pointer $(a, 0)$ clearly requires information about the type of value stored at address $a$; this information is provided by an address typing $\lambda$. One might expect an address typing to map addresses to data types. This turns out not to work, however, because a well-typed program can produce as its value a nonexistent pointer, and such pointers must therefore be typable if type preservation is to hold. For example, the program

arr a[10]; a+17

is well typed and evaluates to $((0, 17), 0)$, a nonexistent pointer. This leads us to define an address typing $\lambda$ to be a finite function mapping segment numbers to data types. The notational conventions for address typings are like those for identifier typings.

We now modify our typing judgments to include an address typing:

$$\lambda; \gamma \vdash e : \tau$$

All of the rules given previously in Figures 1 and 2 need to be extended to include address typings, and we also add the new typing rules given in Figure 5. Furthermore, Figure 5 includes an updated version of rule (fun) from Figure 2. In addition to including an address typing $\lambda$, the new rule replaces $Close_{\gamma}$ with $Close_{\lambda;\gamma}$, which does not generalize type variables that are free in either $\lambda$ or in $\gamma$.
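The reason for typing by segment rather than by address can be seen in a short Python sketch. The function name `pointer_type` and the encoding of types as tuples are assumptions made only for this illustration; the lookup itself follows the (ptr) rule, in which the offset plays no role.

```python
def pointer_type(addr_typing, pointer):
    """Type a pointer ((i, j), 0) as `tau ptr` when addr_typing[i] == tau.

    The offset j is ignored, so a pointer past the end of its array
    still receives a type.
    """
    (i, _j), tag = pointer
    if tag != 0:
        raise ValueError("expected a pointer of the form ((i, j), 0)")
    return (addr_typing[i], "ptr")


# Address typing after `arr a[10]` allocates segment 0 holding integers.
addr_typing = {0: "int"}

# Value of `a+17`: a nonexistent pointer into segment 0.
result = ((0, 17), 0)
print(pointer_type(addr_typing, result))  # ('int', 'ptr')
```

Had the address typing been keyed by full addresses, the lookup for $(0, 17)$ would have no entry and the result of a well-typed program would be untypable.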
To prove the type preservation theorem, we require a number of lemmas that establish some useful properties of the type system. We begin with a basic lemma that shows that our type system types closed values reasonably: it shows that any closed value of some type has the form that one would expect. It also shows that a closed expression of type $\tau\ var$ can have only two possible forms. (Note that $\emptyset$ here denotes an empty identifier typing.)

(var)
$$\lambda; \gamma \vdash ((i, j), 1) : \tau\ var \quad \text{if } \lambda(i) = \tau$$

(ptr)
$$\lambda; \gamma \vdash ((i, j), 0) : \tau\ ptr \quad \text{if } \lambda(i) = \tau$$

($\to$-intro)
$$\dfrac{\lambda; \gamma[x_1 : \tau_1, \ldots, x_n : \tau_n] \vdash e : \tau}{\lambda; \gamma \vdash \lambda x_1, \ldots, x_n.\, e : \tau_1 \times \cdots \times \tau_n \to \tau}$$

(fun)
$$\dfrac{\lambda; \gamma[x_1 : \tau_1, \ldots, x_n : \tau_n] \vdash e : \tau \qquad \lambda; \gamma[x : Close_{\lambda;\gamma}(\tau_1 \times \cdots \times \tau_n \to \tau)] \vdash e' : \tau'}{\lambda; \gamma \vdash x\texttt{(}x_1\texttt{,}\ldots\texttt{,}x_n\texttt{) \{}e\texttt{\} } e' : \tau'}$$

Fig. 5. New Rules for Typing Semantic Values

Lemma 2 (Correct Forms) Suppose $\lambda; \emptyset \vdash v : \tau$. Then
- if $\tau$ is int, then $v$ is an integer literal,
- if $\tau$ is unit, then $v$ is unit,
- if $\tau$ is $\tau'\ ptr$, then $v$ is of the form $((i, j), 0)$, and
- if $\tau$ is $\tau_1 \times \cdots \times \tau_n \to \tau'$, then $v$ is of the form $\lambda x_1, \ldots, x_n.\, e$.
And if $\lambda; \emptyset \vdash e : \tau\ var$, then $e$ is of the form $((i, j), 1)$ or of the form $\texttt{*}e'$.

PROOF. Immediate from inspection of the typing rules. (Note that the last part of the lemma assumes that array subscripting is syntactic sugar.) □

A consequence of the last part of this lemma is that if $\lambda; \emptyset \vdash e : \tau$ and $e$ is not of the form $((i, j), 1)$ or $\texttt{*}e'$, then the typing derivation cannot end with rule (r-val). So the typing rules, for the most part, remain syntax directed.

The fact that variables can have only two possible forms is also exploited in our evaluation rules, specifically within rules (ref) and (update) of Figure 3. In particular, we are able to define the semantics of = and & without defining an auxiliary relation for evaluation in "L-value" contexts; contrast our rules with those given in [3].
We continue with some basic lemmas showing that typings are preserved under substitutions and under extensions to the address and identifier typings:

Lemma 3 (Type Substitution) If $\lambda; \gamma \vdash e : \tau$, then for any substitution $S$, $S\lambda; S\gamma \vdash e : S\tau$, and the latter typing has a derivation no higher than the former.

PROOF. By induction on the structure of the derivation of $\lambda; \gamma \vdash e : \tau$. □

Lemma 4 (Superfluousness) Suppose that $\lambda; \gamma \vdash e : \tau$. If $i \notin \mathrm{dom}(\lambda)$, then $\lambda[i : \tau']; \gamma \vdash e : \tau$, and if $x \notin \mathrm{dom}(\gamma)$, then $\lambda; \gamma[x : \sigma] \vdash e : \tau$.

PROOF. By induction on the height of the derivation of $\lambda; \gamma \vdash e : \tau$. The only way that adding an extra assumption can cause problems is by adding more free type variables to $\lambda$ or $\gamma$, thereby preventing Close from generalizing such variables in (fun) steps. If this happens, we must rename such variables in the original derivation before adding the extra assumption. By the Type Substitution Lemma, we can do this renaming and the height of the derivation is not increased. □

Lemma 5 (Substitution) If $\lambda; \gamma \vdash e : \sigma$ and $\lambda; \gamma[x : \sigma] \vdash e' : \tau$, then $\lambda; \gamma \vdash [e/x]e' : \tau$.

PROOF. Assume that the bound identifiers of $e'$ are renamed as necessary to ensure that no identifier occurring in $e$ occurs bound in $e'$. Then at every use of (ident) or (var-id) on $x$ in the derivation of $\lambda; \gamma[x : \sigma] \vdash e' : \tau$, we can splice in the appropriate derivation for $e$. There may be extra assumptions around at that point, but by the Superfluousness Lemma, they do not cause problems. □

Lemma 6 ($\forall$-intro) If $\lambda; \gamma \vdash e : \tau$ and $\alpha_1, \ldots, \alpha_n$ do not occur free in $\lambda$ or in $\gamma$, then $\lambda; \gamma \vdash e : \forall \alpha_1, \ldots, \alpha_n.\, \tau$.

PROOF. This lemma is a simple corollary to the Type Substitution Lemma. Suppose that $\forall \bar{\alpha}.\, \tau \geq \tau'$. Then there exists a substitution $S = [\bar{\tau}/\bar{\alpha}]$ such that $S\tau = \tau'$. By the Type Substitution Lemma, $S\lambda; S\gamma \vdash e : S\tau$. Hence, since the $\bar{\alpha}$ are not free in $\lambda$ or in $\gamma$, $\lambda; \gamma \vdash e : \tau'$. □

We now return to type preservation. Roughly speaking, we wish to show that if closed program $e$ has type $\tau$ under address typing $\lambda$, and evaluates under memory $\mu$ to $v$, then $v$ also has type $\tau$.
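But since $e$ can allocate addresses and these can occur in $v$, we cannot show that $v$ has type $\tau$ under $\lambda$; we can only show that $v$ has type $\tau$ under some address typing $\lambda'$ that extends $\lambda$. (We denote "$\lambda'$ extends $\lambda$" by $\lambda \subseteq \lambda'$.) Also, we need to assume that $\lambda$ is consistent with $\mu$: for example, if $\lambda(i) = int$, then $\mu$ needs to store integers in segment $i$. Precisely, we define $\mu : \lambda$ if
(i) $\mathrm{dom}(\lambda) = \{ i \mid (i, 0) \in \mathrm{dom}(\mu) \}$, and
(ii) for all $(i, j)$ such that $\mu((i, j))$ is a value, $\lambda; \emptyset \vdash \mu((i, j)) : \lambda(i)$.
Note that $\lambda$ must give a type to uninitialized and dead addresses of $\mu$, but the type can be anything.

We can now prove the type preservation theorem:

Theorem 7 (Type Preservation) If $\mu \vdash e \Rightarrow v, \mu'$, $\lambda; \emptyset \vdash e : \tau$, and $\mu : \lambda$, then there exists $\lambda'$ such that $\lambda \subseteq \lambda'$, $\mu' : \lambda'$, and $\lambda'; \emptyset \vdash v : \tau$.

PROOF. By induction on the structure of the derivation of $\mu \vdash e \Rightarrow v, \mu'$. Here we just show the (bindvar) and (bindfun) cases; the remaining cases are similar.

(bindvar). The evaluation must end with

$$\dfrac{\mu \vdash e_1 \Rightarrow v_1, \mu_1 \qquad (i, 0) \notin \mathrm{dom}(\mu_1) \qquad \mu_1[(i, 0) := v_1] \vdash [((i, 0), 1)/x]e_2 \Rightarrow v_2, \mu_2}{\mu \vdash \texttt{var } x \texttt{ = } e_1\texttt{; } e_2 \Rightarrow v_2, \mu_2[(i, 0) := \text{dead}]}$$

while the typing must end with (letvar):

$$\dfrac{\lambda; \emptyset \vdash e_1 : \tau_1 \qquad \lambda; [x : \tau_1\ var] \vdash e_2 : \tau_2}{\lambda; \emptyset \vdash \texttt{var } x \texttt{ = } e_1\texttt{; } e_2 : \tau_2}$$

and $\mu : \lambda$. By induction, there exists $\lambda_1$ such that $\lambda \subseteq \lambda_1$, $\mu_1 : \lambda_1$, and $\lambda_1; \emptyset \vdash v_1 : \tau_1$. Since $\mu_1 : \lambda_1$ and $(i, 0) \notin \mathrm{dom}(\mu_1)$, also $i \notin \mathrm{dom}(\lambda_1)$. So $\lambda_1 \subseteq \lambda_1[i : \tau_1]$. By rule (var),

$$\lambda_1[i : \tau_1]; \emptyset \vdash ((i, 0), 1) : \tau_1\ var$$

and by Lemma 4,

$$\lambda_1[i : \tau_1]; [x : \tau_1\ var] \vdash e_2 : \tau_2$$

So we can apply Lemma 5 to get

$$\lambda_1[i : \tau_1]; \emptyset \vdash [((i, 0), 1)/x]e_2 : \tau_2$$

Also, $\mu_1[(i, 0) := v_1] : \lambda_1[i : \tau_1]$. So by a second use of induction, there exists $\lambda'$ such that $\lambda_1[i : \tau_1] \subseteq \lambda'$, $\mu_2 : \lambda'$, and $\lambda'; \emptyset \vdash v_2 : \tau_2$.

It only remains to show that $\mu_2[(i, 0) := \text{dead}] : \lambda'$. But this follows immediately from $\mu_2 : \lambda'$.

Remark 8 What would go wrong if we simply removed the deallocated address $(i, 0)$ from the domain of the final memory, rather than marking it dead? Well, with the current definition of $\mu : \lambda$, we would then be forced to remove $i$ from the final address typing.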
But then $\mu_2 - i : \lambda' - i$ would fail, if there were any dangling pointers $((i, j), 0)$ in the range of $\mu_2 - i$. If, instead, we allowed $\lambda'$ to retain the typing for $i$, then the next time that $(i, 0)$ were allocated we would have to change the typing for $i$, rather than extend the address typing.

(bindfun). The evaluation must end with

$$\dfrac{\mu \vdash [\lambda x_1, \ldots, x_n.\, e / x]e' \Rightarrow v, \mu'}{\mu \vdash x\texttt{(}x_1\texttt{,}\ldots\texttt{,}x_n\texttt{) \{}e\texttt{\} } e' \Rightarrow v, \mu'}$$

while the typing must end with (fun):

$$\dfrac{\lambda; [x_1 : \tau_1, \ldots, x_n : \tau_n] \vdash e : \tau \qquad \lambda; [x : Close_{\lambda;\emptyset}(\tau_1 \times \cdots \times \tau_n \to \tau)] \vdash e' : \tau'}{\lambda; \emptyset \vdash x\texttt{(}x_1\texttt{,}\ldots\texttt{,}x_n\texttt{) \{}e\texttt{\} } e' : \tau'}$$

and $\mu : \lambda$. By rule ($\to$-intro),

$$\lambda; \emptyset \vdash \lambda x_1, \ldots, x_n.\, e : \tau_1 \times \cdots \times \tau_n \to \tau$$

and so by Lemma 6,

$$\lambda; \emptyset \vdash \lambda x_1, \ldots, x_n.\, e : Close_{\lambda;\emptyset}(\tau_1 \times \cdots \times \tau_n \to \tau)$$

Therefore, by Lemma 5, $\lambda; \emptyset \vdash [\lambda x_1, \ldots, x_n.\, e / x]e' : \tau'$. So by induction, there exists $\lambda'$ such that $\lambda \subseteq \lambda'$, $\mu' : \lambda'$, and $\lambda'; \emptyset \vdash v : \tau'$. □

5 Type Soundness

The type preservation property does not by itself ensure that a type system is sensible. For example, a type system that assigns every type to every expression trivially satisfies the type preservation property, even though such a type system is useless. The main limitation of type preservation is that it only applies to well-typed expressions that evaluate successfully. Really we would like to be able to say something about what happens when we attempt to evaluate an arbitrary well-typed expression.

One approach to strengthening type preservation (used by Gunter [6] and Harper [9], for example) is to augment the natural semantics with rules specifying that certain expressions evaluate to a special value, TypeError, which has no type. For example, an attempt to dereference a value other than a pointer would evaluate to TypeError. Then, by showing that type preservation holds for the augmented evaluation rules, we get that a well-typed expression cannot evaluate to TypeError. Hence any of the errors that lead to TypeError cannot occur in the evaluation of a well-typed expression. A drawback to this approach is the need to augment the natural semantics.
But, more seriously, this approach does not give us as much information as we would like. It tells us that certain errors will not arise during the evaluation of a well-typed expression, but it leaves open the possibility that there are other errors that we have neglected to check for in the augmented natural semantics.

Another approach is to use a different form of semantics than natural semantics. This is the approach advocated by Wright and Felleisen [25], who use a small-step structured operational semantics to prove type soundness for a number of extensions of ML. However, we find natural semantics to be much more natural and appealing than small-step structured operational semantics, particularly for languages with variables that have bounded lifetimes. (For example, in Ozgen's proposed small-step semantics for Polymorphic C [16], quite subtle mechanisms are employed to deallocate cells at the correct time.) Gunter and Rémy [7] also propose an alternative to natural semantics, which they call partial proof semantics.

What we propose here is different. We argue that one can show a good type soundness theorem for a language, like Polymorphic C, defined using natural semantics. The trouble with natural semantics is that it defines only complete program executions, which are represented by derivation trees. But for a good type soundness theorem, we need a notion of an attempted execution of a program, which may of course fail in various ways. We argue, however, that a natural semantics gives rise in a natural way to a transition semantics, which we call a natural transition semantics, that provides the needed notion of an attempted program execution (see footnote 3).

The basic idea is that a program execution is a sequence of partial derivation trees, which may or may not eventually reach a complete derivation tree.
In a partial derivation tree, some of the nodes may be labeled with pending judgments, which represent expressions that need to be evaluated in the program execution. A pending judgment is of the form $\mu \vdash e \Rightarrow\ ?$. (In contrast, we refer to ordinary judgments $\mu \vdash e \Rightarrow v, \mu'$ as complete judgments.)

Footnote 3: See [23] for a slightly different formulation of natural transition semantics; there, natural transition semantics is applied to a problem of computer security.

Before we define partial derivation trees precisely, we need to make a few comments about the evaluation rules in a natural semantics. First, note that natural semantics rules are actually rule schemas, whose metavariables are instantiated in any use of the rule. Second, note that the hypotheses of each rule are either evaluation judgments $\mu \vdash e \Rightarrow v, \mu'$ or boolean conditions, such as the condition $a \in \mathrm{dom}(\mu)$ in rule (contents). (Such boolean conditions are regarded as complete judgments.) Finally, note that in some hypotheses an evaluation judgment includes an implicit boolean condition. For example, the first hypothesis of rule (deref) is

$$\mu \vdash e \Rightarrow (a, 0), \mu'$$

This hypothesis is really an abbreviation for two hypotheses:

$$\mu \vdash e \Rightarrow v, \mu' \quad \text{and} \quad v \text{ is of the form } (a, 0)$$

Assume henceforth that we use the unabbreviated forms in derivation trees.

We want partial derivation trees to be limited to the trees that can arise in a systematic attempt to build a complete derivation tree; this constrains the form that such a tree can have. Precisely,

Definition 9 A tree $T$ whose nodes are labeled with (partial or complete) judgments is a partial derivation tree if it satisfies the following two conditions:
(i) If a node in $T$ is labeled with a complete judgment $J$, then the subtree rooted at that node is a complete derivation tree for $J$.
(ii) If a node in $T$ is labeled with a pending judgment $\mu \vdash e \Rightarrow\ ?$ and the node has $k$ children, where $k > 0$, then there is an instance of an evaluation rule that has the form

$$\dfrac{J_1 \quad J_2 \quad \cdots \quad J_n}{\mu \vdash e \Rightarrow v, \mu'}$$

where $n \geq k$, and the labels on the children are $J_1, J_2, \ldots, J_k$, respectively, with possibly one exception: if $J_k$ is $\mu_k \vdash e_k \Rightarrow v_k, \mu'_k$, then the $k$th child may alternatively be labeled with the pending judgment $\mu_k \vdash e_k \Rightarrow\ ?$.

One may readily see that a partial derivation tree can have at most one pending judgment on each level, which must be the rightmost node of the level, and whose parent must also be a pending judgment.

Next we define transitions, based on the rules of the natural semantics, that describe how one partial derivation tree can be transformed into another. Suppose that there is an instance of an evaluation rule that has the form

$$\dfrac{J_1 \quad J_2 \quad \cdots \quad J_n}{\mu \vdash e \Rightarrow v, \mu'}$$

where each hypothesis $J_i$ is either an evaluation judgment $\mu_i \vdash e_i \Rightarrow v_i, \mu'_i$ or else a boolean condition $B_i$. The transformations resulting from this rule are defined as follows: Suppose that a partial derivation tree $T$ contains a node $N$ labeled with the pending judgment $\mu \vdash e \Rightarrow\ ?$ and that the children of $N$ are labeled with the complete judgments $J_1, J_2, \ldots, J_k$ where $k \geq 0$.
- Suppose $k < n$. Then if $J_{k+1}$ is of the form $\mu_{k+1} \vdash e_{k+1} \Rightarrow v_{k+1}, \mu'_{k+1}$, we can transform $T$ by adding another child to $N$, labeled with the pending judgment $\mu_{k+1} \vdash e_{k+1} \Rightarrow\ ?$. And if $J_{k+1}$ is a boolean condition $B_{k+1}$ that is true, we can transform $T$ by adding another child to $N$, labeled with $B_{k+1}$.
- Now suppose $k = n$. Then we can transform $T$ by replacing the label on $N$ with the complete judgment $\mu \vdash e \Rightarrow v, \mu'$.

We write $T \longrightarrow T'$ if partial derivation tree $T$ can be transformed in one step to $T'$. As usual, $\longrightarrow^{*}$ denotes the reflexive, transitive closure of $\longrightarrow$.

Remark 10 We remark that, in the case of Polymorphic C, the transformation relation thus defined is almost deterministic.
In particular, although there are two evaluation rules for if (e1) {e2} else {e3} and while (e1) {e2}, there is no ambiguity, since we need not choose which rule is being applied until after the guard e1 has been evaluated. The only nondeterminism in the transformation relation is in rules (bindvar) and (bindarr). The second hypothesis of both rules is $(i, 0) \notin \mathrm{dom}(\mu_1)$, and here metavariable $i$ is not bound deterministically. But, of course, this nondeterministic choice of an address for a newly-allocated variable or array is of no importance. □

A key property of $\longrightarrow$ is that it always transforms a partial derivation tree into another partial derivation tree:

Lemma 11 If $T$ is a partial derivation tree and $T \longrightarrow T'$, then $T'$ is also a partial derivation tree.

PROOF. Straightforward. □

The transformation rules give us the desired notion of program execution: to execute $e$ in memory $\mu$, we start with the tree $T_0$ which consists of a single root node labeled with the pending judgment $\mu \vdash e \Rightarrow\ ?$, and then we apply the transformations, generating a sequence of partial derivation trees:

$$T_0 \longrightarrow T_1 \longrightarrow T_2 \longrightarrow T_3 \longrightarrow \cdots$$

More precisely, we define an execution of program $e$ in memory $\mu$ to be a possibly infinite sequence of partial derivation trees $T_0, T_1, T_2, \ldots$ such that
- $T_0$ is a one-node tree labeled with $\mu \vdash e \Rightarrow\ ?$,
- for all $i \geq 0$, $T_i \longrightarrow T_{i+1}$ (unless $T_i$ is the last tree in the sequence), and
- if the sequence has a last tree $T_n$, then there is no tree $T$ such that $T_n \longrightarrow T$.

Note that there are three possible outcomes to an execution:
(i) The sequence ends with a complete derivation tree. This is a successful execution.
(ii) The sequence is infinite. This is a nonterminating execution.
(iii) The sequence ends with a tree $T_n$ that contains a pending judgment but has no successor. This is an aborted execution.

Our Type Soundness theorem will show that, for well-typed programs, aborted execution can arise only from one of a specific set of errors. But first, we argue that our notion of execution is correct.
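The notion of execution as a sequence of partial derivation trees can be illustrated with a small Python sketch. It is not Polymorphic C: the toy language (integer literals, addition, and a read of a fixed address), the tuple encoding of expressions, and the names `new_node`, `step`, and `execute` are all assumptions made for this example. The memory is never modified, and a boolean hypothesis is checked in the same step that completes the node rather than being added as a separate child.

```python
def new_node(expr):
    """A node labeled with a pending judgment for expr."""
    return {"expr": expr, "result": None, "kids": []}


def step(node, mem):
    """Apply one transformation; return False if none applies (stuck)."""
    # A pending rightmost child must be worked on before its parent.
    if node["kids"] and node["kids"][-1]["result"] is None:
        return step(node["kids"][-1], mem)
    kind = node["expr"][0]
    k = len(node["kids"])  # number of completed hypotheses so far
    if kind == "lit":
        node["result"] = node["expr"][1]  # axiom: no hypotheses
        return True
    if kind == "add":
        if k < 2:
            # Add a pending judgment for the next operand, left to right.
            node["kids"].append(new_node(node["expr"][k + 1]))
        else:
            node["result"] = node["kids"][0]["result"] + node["kids"][1]["result"]
        return True
    if kind == "read":
        addr = node["expr"][1]
        if addr not in mem:
            return False  # the boolean hypothesis is false: no successor
        node["result"] = mem[addr]
        return True
    raise ValueError("unknown expression form")


def execute(expr, mem, max_steps=1000):
    """Run until the root is complete, the tree is stuck, or the bound is hit."""
    root = new_node(expr)
    for _ in range(max_steps):
        if root["result"] is not None:
            return ("successful", root["result"])
        if not step(root, mem):
            return ("aborted", None)
    if root["result"] is not None:
        return ("successful", root["result"])
    return ("undecided", None)


print(execute(("add", ("lit", 1), ("read", (0, 0))), {(0, 0): 41}))
# ('successful', 42)
print(execute(("read", (3, 0)), {}))
# ('aborted', None)
```

This toy language has no loops, so the nonterminating outcome cannot occur here; the "undecided" answer only reports that the step bound was reached, which is the most a bounded driver can say about a possibly infinite sequence.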
Let us write $[J]$ to denote the one-node tree labeled with $J$. The soundness of our notion of execution is given by the following lemma.

Lemma 12 If $[\mu \vdash e \Rightarrow\ ?] \longrightarrow^{*} T$, where $T$ contains no pending judgments, then $T$ is a complete derivation tree for a judgment of the form $\mu \vdash e \Rightarrow v, \mu'$.

PROOF. By Lemma 11, $T$ is a partial derivation tree. So, since $T$ contains no pending judgments, $T$ is a complete derivation tree for the judgment that labels its root. And this judgment must be of the form $\mu \vdash e \Rightarrow v, \mu'$, because the initial tree has a root labeled with $\mu \vdash e \Rightarrow\ ?$ and (as can be seen by inspecting the definition of $\longrightarrow$) the only transformation that changes the label on a node changes a label of the form $\mu \vdash e \Rightarrow\ ?$ to a label of the form $\mu \vdash e \Rightarrow v, \mu'$. □

Next we show that our notion of execution is complete:

Lemma 13 If $\mu \vdash e \Rightarrow v, \mu'$ and $T$ is a complete derivation tree for $\mu \vdash e \Rightarrow v, \mu'$, then $[\mu \vdash e \Rightarrow\ ?] \longrightarrow^{*} T$.

PROOF. By induction on the structure of the derivation of $\mu \vdash e \Rightarrow v, \mu'$. □

Remark 14 This lemma shows that if $\mu \vdash e \Rightarrow v, \mu'$, then there is a successful execution of $e$ in $\mu$. But it does not show that every execution of $e$ in $\mu$ is successful. With an arbitrary natural semantics, this need not be so. For example, in a language with a nondeterministic choice operator, some executions of $e$ in $\mu$ may be successful, others may be nonterminating, and others may abort. But in Polymorphic C, since $\longrightarrow$ is essentially deterministic, a stronger result should hold. □

Now that we have a notion of program execution, we again turn to Polymorphic C and consider what we can say about the executions of well-typed Polymorphic C programs.

Definition 15 A pending judgment $\mu \vdash e \Rightarrow\ ?$ is well typed iff there exist an address typing $\lambda$ and a type $\tau$ such that $\mu : \lambda$ and $\lambda; \emptyset \vdash e : \tau$. Also, a partial derivation tree $T$ is well typed iff every pending judgment in it is well typed.
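Definition 15 relies on the consistency relation between a memory and an address typing defined in Section 4. A minimal Python sketch of that check is given below, under stated assumptions: memories are dictionaries keyed by (segment, offset), `DEAD` and `UNINIT` are hypothetical markers for the two special results, and the typing of closed values is abstracted as a caller-supplied predicate `has_type`, which is not part of the paper.

```python
DEAD = object()    # marker for a deallocated cell
UNINIT = object()  # marker for an uninitialized cell


def consistent(mem, addr_typing, has_type):
    """Check the two conditions of the consistency relation.

    (i)  the typed segments are exactly those i with (i, 0) in the memory;
    (ii) every cell holding a value has the type assigned to its segment.
    Dead and uninitialized cells impose no constraint on the type.
    """
    segments = {i for (i, j) in mem if j == 0}
    if set(addr_typing) != segments:
        return False
    for (i, _j), cell in mem.items():
        if cell is DEAD or cell is UNINIT:
            continue
        if i not in addr_typing or not has_type(cell, addr_typing[i]):
            return False
    return True


def int_only(value, tau):
    """Toy typing predicate: only integer values of type 'int' are typable."""
    return tau == "int" and isinstance(value, int)


mem = {(0, 0): 5, (0, 1): UNINIT, (1, 0): DEAD}
print(consistent(mem, {0: "int", 1: "int"}, int_only))  # True
print(consistent(mem, {0: "int"}, int_only))            # False
```

The second call fails on condition (i): segment 1 is still in the memory, marked dead, so the address typing must keep an entry for it.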
Roughly speaking, the combination of the Type Preservation theorem and the Correct Forms lemma (Lemma 2) allows us to characterize the forms of expressions that will be encountered during the execution of a well-typed program. This allows us to characterize what can go wrong during the execution. Here is the key type soundness result:

Theorem 16 (Progress) Let $T$ be a well-typed partial derivation tree that contains at least one pending judgment. If $T \longrightarrow T'$, then $T'$ is well typed. Furthermore, there exists $T'$ such that $T \longrightarrow T'$, unless $T$ contains one of the following errors:
E1. A read or write to a dead address.
E2. A read or write to an address with an invalid offset.
E3. A read of an uninitialized address.
E4. A declaration of an array of size 0 or less.

PROOF. Let $N$ be the uppermost node in $T$ that is labeled with a pending judgment, say $\mu \vdash e \Rightarrow\ ?$. Then any transformation on $T$ must occur at this node. We just consider all possible forms of expression $e$. Here we just give the case $e_1 \texttt{=} e_2$; the other cases are quite similar.

Since $T$ is well typed, the pending judgment $\mu \vdash e_1 \texttt{=} e_2 \Rightarrow\ ?$ is well typed, and so there exist $\lambda$ and $\tau$ such that $\mu : \lambda$ and $\lambda; \emptyset \vdash e_1 \texttt{=} e_2 : \tau$. The latter typing must be by (assign):

$$\dfrac{\lambda; \emptyset \vdash e_1 : \tau\ var \qquad \lambda; \emptyset \vdash e_2 : \tau}{\lambda; \emptyset \vdash e_1 \texttt{=} e_2 : \tau}$$

By the Correct Forms lemma, $e_1$ must be of the form $((i, j), 1)$ or else of the form $\texttt{*}e_1'$. So, simplifying notation a bit, the pending judgment that labels $N$ has the form $\mu \vdash (a, 1) \texttt{=} e \Rightarrow\ ?$ or $\mu \vdash \texttt{*}e_1 \texttt{=} e_2 \Rightarrow\ ?$. We consider these two cases in turn.

If the label of $N$ is $\mu \vdash (a, 1) \texttt{=} e \Rightarrow\ ?$, where $\mu : \lambda$ and $\lambda; \emptyset \vdash (a, 1) \texttt{=} e : \tau$, then the typing must end with (assign):

$$\dfrac{\lambda; \emptyset \vdash (a, 1) : \tau\ var \qquad \lambda; \emptyset \vdash e : \tau}{\lambda; \emptyset \vdash (a, 1) \texttt{=} e : \tau}$$

So by (var), $a$ is of the form $(i, j)$, where $\lambda(i) = \tau$. Now, if $N$ has no children, then (using rule (update)), we can transform $T$ by adding to $N$ a new child, labeled with the pending judgment $\mu \vdash e \Rightarrow\ ?$. Furthermore, this is the only possible transformation, and since $\lambda; \emptyset \vdash e : \tau$, this new pending judgment is well typed.
If $N$ has exactly one child, then by condition (ii) of the definition of partial derivation tree and the fact that $N$ is the uppermost node labeled with a pending judgment, it must be that the child of $N$ is labeled with a judgment of the form $\mu \vdash e \Rightarrow v, \mu'$. In this case, we may transform $T$ by adding a new child to $N$ labeled with the boolean condition

$$a \in \mathrm{dom}(\mu') \text{ and } \mu'(a) \neq \text{dead}$$

provided that this condition is true. Now, by the Type Preservation theorem, there exists $\lambda'$ such that $\lambda \subseteq \lambda'$, $\mu' : \lambda'$, and $\lambda'; \emptyset \vdash v : \tau$. Hence $\lambda'(i) = \tau$, and so $(i, 0) \in \mathrm{dom}(\mu')$. So if $(i, j) \notin \mathrm{dom}(\mu')$, then $T$ contains error E2, a write to an address with an invalid offset $j$. And if $\mu'((i, j)) = \text{dead}$, then $T$ contains error E1, a write to a dead address. Hence we can transform $T$ unless it contains error E2 or E1.

Finally, if $N$ has two children, then they must be labeled with the hypotheses of rule (update), and so we can transform $T$ by replacing the label of $N$ with $\mu \vdash (a, 1) \texttt{=} e \Rightarrow v, \mu'[a := v]$.

If the label of $N$ is $\mu \vdash \texttt{*}e_1 \texttt{=} e_2 \Rightarrow\ ?$, where $\mu : \lambda$ and $\lambda; \emptyset \vdash \texttt{*}e_1 \texttt{=} e_2 : \tau$, then the typing must end with (l-val) followed by (assign):

$$\dfrac{\dfrac{\lambda; \emptyset \vdash e_1 : \tau\ ptr}{\lambda; \emptyset \vdash \texttt{*}e_1 : \tau\ var} \qquad \lambda; \emptyset \vdash e_2 : \tau}{\lambda; \emptyset \vdash \texttt{*}e_1 \texttt{=} e_2 : \tau}$$

Now, if $N$ has no children, then the only applicable transformation (using rule (update)) is to add to $N$ a new child, labeled with the pending judgment $\mu \vdash e_1 \Rightarrow\ ?$. Since $\lambda; \emptyset \vdash e_1 : \tau\ ptr$, this new pending judgment is well typed.

If $N$ has exactly one child, then by condition (ii) of the definition of partial derivation tree and the fact that $N$ is the uppermost node labeled with a pending judgment, it must be that the child of $N$ is labeled with a judgment of the form $\mu \vdash e_1 \Rightarrow v_1, \mu_1$. By the Type Preservation theorem, there exists $\lambda_1$ such that $\lambda \subseteq \lambda_1$, $\mu_1 : \lambda_1$, and $\lambda_1; \emptyset \vdash v_1 : \tau\ ptr$. So by the Correct Forms lemma, $v_1$ is of the form $((i, j), 0)$. Hence, we may transform $T$ by adding a new child to $N$ labeled with the boolean condition "$v_1$ is of the form $(a, 0)$", since this is guaranteed to be true. Also, by (ptr), $\lambda_1(i) = \tau$.
If $N$ has two children, then we can transform $T$ by adding a new child labeled with the pending judgment $\mu_1 \vdash e_2 \Rightarrow\ ?$. By the Superfluousness Lemma, $\lambda_1; \emptyset \vdash e_2 : \tau$, so this pending judgment is well typed.

If $N$ has three children, then the third child of $N$ must be labeled with a judgment of the form $\mu_1 \vdash e_2 \Rightarrow v, \mu_2$. In this case, we may transform $T$ by adding a new child to $N$ labeled with the boolean condition

$$a \in \mathrm{dom}(\mu_2) \text{ and } \mu_2(a) \neq \text{dead}$$

provided that this condition is true. As before, by the Type Preservation theorem, there exists $\lambda'$ such that $\lambda_1 \subseteq \lambda'$, $\mu_2 : \lambda'$, and $\lambda'; \emptyset \vdash v : \tau$. Hence $\lambda'(i) = \tau$, and so $(i, 0) \in \mathrm{dom}(\mu_2)$. So if $(i, j) \notin \mathrm{dom}(\mu_2)$, then $T$ contains error E2, a write to an address with an invalid offset $j$. And if $\mu_2((i, j)) = \text{dead}$, then $T$ contains error E1, a write to a dead address. Hence we can transform $T$ unless it contains error E2 or E1.

Finally, if $N$ has four children, then they must be labeled with the hypotheses of rule (update), and so we can transform $T$ by replacing the label of $N$ with $\mu \vdash \texttt{*}e_1 \texttt{=} e_2 \Rightarrow v, \mu_2[a := v]$. □

The Progress theorem gives our Type Soundness result as a simple corollary:

Corollary 17 (Type Soundness) If $\lambda; \emptyset \vdash e : \tau$ and $\mu : \lambda$, then any execution of $e$ in $\mu$ either
(i) succeeds,
(ii) does not terminate, or
(iii) aborts due to one of the errors E1, E2, E3, or E4.

PROOF. Let $T_0 \longrightarrow T_1 \longrightarrow T_2 \longrightarrow \cdots$ be an execution of $e$ in $\mu$. Then $T_0 = [\mu \vdash e \Rightarrow\ ?]$, which is well typed by assumption. So, by the Progress theorem, every $T_i$ is well typed, and furthermore, if $T_i$ contains a pending judgment, then it has a successor unless it contains one of the errors E1, E2, E3, or E4. So, if the execution is finite, it either ends with a complete derivation tree or with a tree containing one of the errors E1, E2, E3, or E4. □

6 Discussion

One of the most desirable properties of a programming language implementation is that it guarantee the safe execution of programs. This means that a program's execution is always faithful to the language's semantics, even if the program is erroneous.
C is, of course, a notoriously unsafe language: in typical implementations, pointer errors can cause a running C program to overwrite its runtime stack, resulting in arbitrarily bizarre behavior. Sometimes this results in a "Segmentation fault (core dumped)" message (though this may occur far after the original error); worse, at other times the program appears to run successfully, even though the results are entirely invalid.

Three techniques can be used to provide safe execution:
(i) The language can be designed so that some errors are impossible. For example, a language can define default initializations for variables, thereby preventing uninitialized variable errors.
(ii) The language can perform compile-time checks, such as type checks, to guard against other errors.
(iii) Finally, runtime checks can be used to catch other errors.

In the case of Polymorphic C, the Type Soundness theorem (Corollary 17) specifies exactly what runtime checks are needed to guarantee safe execution. The trouble is, except for error E4 (declaring an array of size 0 or less), typical C implementations do not make these checks. What would we expect, then, of implementations of Polymorphic C? Well, it is actually not too difficult to check for error E2 (reading or writing an address with an invalid offset): for each pointer, we must maintain at runtime the range of permissible offsets. And error E3 (reading an uninitialized address) can also be checked fairly efficiently, by initializing array cells with a special uninit value. That leaves only error E1 (reading or writing a dead address). This, of course, is very difficult to check efficiently. In our natural semantics, we make this check possible by never reusing any cells!

Hence we reach a point of trade-offs. We can directly implement our natural semantics, getting a safe but inefficient "debugging" implementation of Polymorphic C.
Or we can follow usual C practice and build a stack-based implementation that leaves errors E1 (and perhaps E2 and E3 as well) unchecked, achieving efficiency at the expense of safety (see footnote 4). In this case, the Type Soundness theorem at least tells us what kinds of errors we need to look for in debugging our programs. As a final alternative, we can change the semantics of Polymorphic C by giving cells unbounded lifetimes (thereby necessitating garbage collection), as was done in the design of Java [1].

Footnote 4. More precisely, allocating variables and arrays on a stack in Polymorphic C (or in any language with & or that unifies arrays and pointers) causes the type preservation property to fail.

7 Conclusion

Advanced polymorphic type systems have come to play a central role in the world of functional programming, but so far have had little impact on traditional imperative programming. We assert that an ML-style polymorphic type system can be applied fruitfully to a "real-world" language like C, bringing to it both the expressiveness of polymorphism and a rigorous characterization of the behavior of well-typed programs.

Future work on Polymorphic C includes the development of efficient implementations of polymorphism (perhaps using the work of [13,18,10]) and the extension of the language to include other features of C, especially structures.

References

[1] Ken Arnold and James Gosling. The Java Programming Language. Addison-Wesley, 1996.
[2] Luis Damas and Robin Milner. Principal type-schemes for functional programs. In Proceedings of the 9th ACM Symposium on Principles of Programming Languages, pages 207–212, New York, 1982. ACM.
[3] Pascal Fradet, Ronan Gaugne, and Daniel Le Métayer. Static detection of pointer errors: An axiomatisation and a checking algorithm. In Proceedings of the 6th European Symposium on Programming, volume 1058 of Lecture Notes in Computer Science, pages 125–140, Berlin, April 1996. Springer-Verlag.
[4] Michael Gordon, Robin Milner, and Christopher Wadsworth. Edinburgh LCF, volume 78 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1979.
[5] John Greiner. Standard ML weak polymorphism can be sound. Technical Report CMU-CS-93-160, School of Computer Science, Carnegie Mellon Univ., Pittsburgh, Pa., May 1993.
[6] Carl Gunter. Semantics of Programming Languages: Structures and Techniques. MIT Press, 1992.
[7] Carl Gunter and Didier Rémy. A proof-theoretic assessment of runtime type errors. Technical Report 11261-921230-43TM, AT&T Bell Laboratories, 1993.
[8] Robert Harper. A simplified account of polymorphic references. Information Processing Letters, 51:201–206, August 1994.
[9] Robert Harper. A note on "A simplified account of polymorphic references". Information Processing Letters, 57:15–16, January 1996.
[10] Robert Harper and Greg Morrisett. Compiling polymorphism using intensional type analysis. In Proceedings of the 22nd ACM Symposium on Principles of Programming Languages, pages 130–141, New York, 1995. ACM.
[11] My Hoang, John Mitchell, and Ramesh Viswanathan. Standard ML/NJ weak polymorphism and imperative constructs. In Proceedings of the 8th IEEE Symposium on Logic in Computer Science, pages 15–25, New York, 1993. IEEE.
[12] Brian Kernighan and Dennis Ritchie. The C Programming Language. Prentice-Hall, 1978.
[13] Xavier Leroy. Unboxed objects and polymorphic typing. In Proceedings of the 19th ACM Symposium on Principles of Programming Languages, pages 177–188, New York, 1992. ACM.
[14] Xavier Leroy and Pierre Weis. Polymorphic type inference and assignment. In Proceedings of the 18th ACM Symposium on Principles of Programming Languages, pages 291–302, New York, 1991. ACM.
[15] Robin Milner, Mads Tofte, and Robert Harper. The Definition of Standard ML. MIT Press, 1990.
[16] Mustafa Ozgen. A type inference algorithm and transition semantics for Polymorphic C. Master's thesis, Department of Computer Science, Naval Postgraduate School, Monterey, CA, September 1996.
[17] John C. Reynolds. The essence of ALGOL. In de Bakker and van Vliet, editors, Algorithmic Languages, pages 345–372. IFIP, North-Holland Publishing Company, 1981.
[18] Zhong Shao and Andrew Appel. A type-based compiler for Standard ML. In Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 116–129, 1995.
[19] Geoffrey Smith and Dennis Volpano. Polymorphic typing of variables and references. ACM Transactions on Programming Languages and Systems, 18(3):254–267, May 1996.
[20] Geoffrey Smith and Dennis Volpano. Towards an ML-style polymorphic type system for C. In Proceedings of the 6th European Symposium on Programming, volume 1058 of Lecture Notes in Computer Science, pages 341–355, Berlin, April 1996. Springer-Verlag.
[21] Mads Tofte. Type inference for polymorphic references. Information and Computation, 89:1–34, 1990.
[22] Dennis Volpano and Geoffrey Smith. A type soundness proof for variables in LCF ML. Information Processing Letters, 56:141–146, 1995.
[23] Dennis Volpano and Geoffrey Smith. Eliminating covert flows with minimum typings. In Proc. 10th IEEE Computer Security Foundations Workshop, pages 156–168. IEEE, June 1997.
[24] Andrew Wright. Simple imperative polymorphism. Lisp and Symbolic Computation, 8(4):343–356, December 1995.
[25] Andrew Wright and Matthias Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, November 1994.

Journal: Sci. Comput. Program.

Volume 32

Publication date: 1998
